GENETIC POLYMORPHISMS WHICH ARE ASSOCIATED WITH AUTISM SPECTRUM DISORDERS

The present application claims the benefit of U.S. Provisional Patent Application Ser. No. 60/049,803, filed Jun. 17, 1997.

The subject matter of this application was made with support from the United States Government under Grants No. R01AA08666, R01NS24287, R01HD34295, R01HD34969, and 2P30 ES01247 from the National Institutes of Health and Grant No. R824758 from the Environmental Protection Agency. The United States Government may retain certain rights.

FIELD OF THE INVENTION

The present invention relates to a method of screening subjects for genetic markers associated with autism. The invention further relates to isolated nucleic acids having polymorphisms associated with autism, the polypeptide products of those nucleic acids, and antibodies specific to the polypeptides produced by the mutated genes.

BACKGROUND OF THE INVENTION

Autism is a behaviorally defined syndrome characterized by impairment of social interaction, deficiency or abnormality of speech development, and limited activities and interest (American Psychiatric Association, 1994). The last category includes such abnormal behaviors as fascination with spinning objects, repetitive stereotypic movements, obsessive interests, and abnormal aversion to change in the environment. Symptoms are present by 30 months of age. The prevalence rate in recent Canadian studies using total ascertainment is over 1/1,000 (Bryson, S. E. et al., J. Child Psychol. Psychiat., 29, 433 (1988)).

Attempts to identify the cause of the disease have been difficult, in part, because the symptoms do not suggest a brain region or system where injury would result in the diagnostic set of behaviors. Further, the nature of the behaviors included in the criteria precludes an animal model of the diagnostic symptoms and makes it difficult to relate much of the experimental literature on brain injuries to the symptoms of autism.

Several quantitative changes have been observed in autistic brains at autopsy. An elevation of about 100 g in brain weight has been reported (Bauman, M. L. and Kemper, T. L., Neurology 35, 866 (1985)). While attempts to find anatomical changes in the cerebral cortex have been unsuccessful (Williams, R. S. et al., Arch. Neurol., 37, 749 (1980); Coleman P. D., et al., J. Autism Dev. Disord., 15, 245 (1985)), several brains have been found to have elevated neuron packing density in structures of the limbic system (Bauman, M. L. and Kemper, T. L., Neurology 35, 866 (1985)), including the amygdala, hippocampus, septal nuclei and mammillary body. Multiple cases in multiple labs have been found to have abnormalities of the cerebellum. A deficiency of Purkinje cell and granule cell number, as well as reduced cell counts in the deep nuclei of the cerebellum and neuron shrinkage in the inferior olive, have been reported (Bauman, M. L. and Kemper, T. L., Neurology, 35, 866 (1985); Bauman, M. L. and Kemper, T. L., Neurology 36 (suppl. 1), 190 (1986); Bauman, M. L. and Kemper, T. L., The Neurobiology of Autism, Johns Hopkins University Press, 119 (1994); Ritvo, E. R. et al., Am. J. Psychiat., 143, 862 (1986); Kemper, T. L. and Bauman M. L., Neurobiology of Infantile Autism, Elsevier Science Publishers, 43 (1992)).

Imaging studies have allowed examination of some anatomical characteristics in living autistic patients, providing larger samples than those available for histologic evaluation. In general, these confirm that the size of the brain in autistic individuals is not reduced and that most regions are also normal in size (Piven, J. et al., Biol. Psychiat., 31, 491 (1992)). Reports of size reductions in the brainstem have been inconsistent (Gaffney, G. R. et al., Biol. Psychiat., 24, 578 (1988); Hsu, M. et al., Arch. Neurol., 48, 1160 (1991)), but a new, larger study suggests that the midbrain, pons, and medulla are smaller in autistic cases than in controls (Hashimoto, T. et al., J. Aut. Dev. Disord., 25, 1 (1995)). In light of histological effects reported for the cerebellum, it is interesting that the one region repeatedly identified as abnormal in imaging studies is the neocerebellar vermis (lobules VI and VII; Gaffney, G. R. et al., Am. J. Dis. Child., 141, 1330 (1987); Courchesne E., et al., N. Engl. J. Med., 318, 1349 (1988); Hashimoto, T. et al., J. Aut. Dev. Disord., 25, 1 (1995)). Not all comparisons have found a difference in neocerebellar size (Piven, J. et al., Biol. Psychiat., 31, 491 (1992); Kleiman, M. D. et al., Neurology, 42, 753 (1992)), but a recent reevaluation of positive and negative studies (Courchesne, E. et al., Neurology, 44, 214 (1994)) indicates that a few autistic cases have hyperplasia of the neocerebellar vermis, while many have hypoplasia. Small samples of this heterogeneous population could explain disparate results regarding the size of the neocerebellum in autism.

The proposal that the cerebellum in autistic cases can be either large or small is reasonable from an embryological standpoint, because injuries to the developing brain are sometimes followed by rebounds of neurogenesis (e.g., Andreoli, J. et al., Am. J. Anat. 137, 87 (1973); Bohn, M. C. and Lauder, J. M., Dev. Neurosci., 1, 250 (1978); Bohn, M. C., Neuroscience, 5, 2003 (1980)), and it is possible that such rebounds could overshoot the normal cell number. Further, because increased cell density has been observed in the limbic system, the cerebellum is not the only brain region in which some form of overgrowth might account for the neuro-anatomy of autistic cases. It may well be that some autism-inducing injuries occur just prior to a period of rapid growth for the cerebellar lobules in question or the limbic system, leading to excess growth, while other injuries continue to be damaging during the period of rapid growth, leading to hypoplasia. However, the hypothesis that autism occurs with both hypoplastic and hyperplastic cerebella calls into question whether cerebellar anomalies play a major role in autistic symptoms.

A particularly instructive result has appeared in an MRI study on the cerebral cortex (Piven, J. et al., Am. J. Psychiat., 14, 734 (1992)). Of a small sample of autistic cases, the majority showed gyral anomalies (e.g., patches of pachygyria). However, the abnormal areas were not located in the same regions from case to case. That is, while the functional symptoms were similar in all the subjects, the brain damage observed was not. The investigators argue convincingly that the cortical anomalies were not responsible for the functional abnormalities. This is a central problem in all attempts to screen for pathology in living patients or in autopsy cases. While abnormalities may be present, it is not necessarily true that they are related to the symptoms of autism.

To teratologists, the physical anomalies of a neonate, child, or adult can serve as a guide to when the embryo was injured. Years of research have amplified the details of that timetable for the nervous system (Rodier, P. M., Dev. Med. Child Neurol., 22, 525 (1980); Bayer, S. A. et al., Neurotoxicology, 14, 83 (1993)). In the case of autism, lack
of specific information on the neuroanatomy associated with the disease has made it difficult to estimate the stage of development when the disorder arises. However, in 1993, Miller and Stromland reported a finding that conclusively identified the time of origin for some cases. They observed that the rate of autism was 33% in people exposed to thalidomide between the 20th and 24th days of gestation, and 0% in cases exposed at other times (Stromland, K. et al., Devel. Med. Child. Neurol., 36, 351 (1994)). Their deduction regarding the time of injury was not based on neuroanatomy, which was not known in their living subjects. Instead, it was based on the external stigmata of the cases.

Because thousands of thalidomide-exposed offspring have been evaluated for somatic malformations, the array of injuries associated with the drug is well-known, and the time when each arises has been carefully defined (Miller, M. T., Trans. Am. Ophthalmol. Soc., 89, 623 (1991)). Of five cases of thalidomide-induced autism, four had malformations of the ears, without limb malformation, and the fifth had malformation of the ears, forelimb, and hindlimb. Thalidomide is not teratogenic before the 20th day of gestation. Starting on day 20, exposure causes ear malformation and abnormalities of the thumb. Limb malformations (other than those of the thumb) first appear with exposure on the 25th day, with effects moving from the forelimb to the hindlimb as exposure occurs at later stages. After the 35th day, thalidomide produces no malformations. Thus, the cases with malformations restricted to the ear must have been exposed before day 25, and the one patient with multiple malformations can only be explained as a case of repeated injuries at several stages of development.

In fact, the idea that autism might arise very early in gestation was suggested long ago. Steg and Rapoport (J. Aut. Child. Schiz., 5, 299 (1975)) noted the significant increase in minor physical anomalies among children with autism, and realized that they indicated an injury in the first trimester. Several studies of minor malformations have found ear effects to be the most common anomalies in autism (Walker, H. A., J. Aut. Child Schiz., 7, 165 (1977); Campbell, M. et al., Am. J. Psychiat., 135, 573 (1978)), and the most recent study shows that they are not only the best discriminator between people with autism and normal controls, but also the only anomaly that discriminates autism from other developmental disabilities (Rodier, P. M. et al., Teratology 55, 319 (1997)). Ear anomalies are among the earliest of all minor physical malformations in their time of origin.

External malformations are not the only evidence which puts the time of injury in autism at the time of neural tube closure. The cranial nerve dysfunctions observed in the patients with autism secondary to thalidomide exposure, namely facial nerve palsy, Duane syndrome (lack of abducens innervation with reinnervation of the lateral rectus by the oculomotor nerve), abnormal lacrimation, gaze paresis, and hearing deficits (Stromland, K. et al., Devel. Med. Child. Neurol., 36, 351 (1994)), suggest that the earliest-forming structures of the brain stem were damaged, and it is now known that these form during neural tube closure (Bayer, S. A. et al., Neurotoxicology, 14, 83 (1993)). Subsequent studies have shown that a human brain from a patient with autism has the same pattern of brain stem injury predicted by the thalidomide cases (Rodier, P. M. et al., J. Comp. Neurol., 370, 247 (1996)). Perhaps even more importantly, the autopsied brain has a shortening of the brain stem in the region of the fifth rhombomere, and is missing two of the nuclei known to form from that embryological structure. The rhombomeres exist so briefly (Streeter, G. L., Contr. Embryol. Carneg. Instn., 30, 213 (1948)) that the evidence that one failed to form is conclusive in pinpointing the time of injury. Like the thalidomide cases, the autopsy case could have been injured only at the time of neural tube closure.

The effect of injury around neural tube closure has been tested experimentally, to see whether it can produce anatomical results like those suspected in the thalidomide cases and observed in human brain. Animals exposed during the critical period to valproic acid, a teratogen with effects similar to thalidomide, which has also been associated with autism (Christianson, A. L. et al., Devel. Med. Child. Neurol., 36, 357 (1994); Williams, P. G. et al., Dev. Med. Child. Neurol., 39, 632 (1997)) exhibit reductions in the number of cranial nerve motor neurons (Rodier, P. M. et al., J. Comp. Neurol., 370, 247 (1996)). They are distinguished from controls by shortening of the hindbrain in the region which forms from the fifth rhombomere, just as the autopsied brain was (Rodier, P. M., et al., Teratology 55, 319 (1997)). Additional data suggests that the animal model has secondary changes in the cerebellum like those reported in some human cases of autism (Ingram, J. L. et al., Teratology, 53, 86 (1996)).

It has long been known that heritable factors play an important role in the etiology of autism. This was demonstrated by the original twin studies of Folstein and Rutter (J. Child Psychol. Psychiat., 18, 297 (1977)) and the subsequent addition of more twin pairs to the sample has only increased the estimate of the proportion of cases suspected to have a genetic basis (e.g. Bailey, A. et al., Psychol. Med., 25, 63 (1995); LeCouteur, A. et al., J. Child Psychol. Psychiat., 37, 785 (1996)). Family studies of siblings (Smalley, S. L. et al., Arch. Gen. Psychiat., 45, 953 (1988)) and parents (Landa, R. et al., J. Speech Hear. Res., 34, 1339 (1991); Landa, R. et al., Psych. Med., 22, 245 (1992)) also support the conclusion that an inherited risk is involved in many, perhaps all, cases of autism spectrum disorders. While the rate of autism is elevated in close relatives of cases, the rate of symptoms short of the diagnosis is increased much more. That is, individuals known to share genetic factors seem to vary in the degree to which symptoms are expressed. This non-Mendelian pattern (Jorde, L. B. et al., Am. J. Hum. Genet., 49, 932 (1991)) suggests a complex disorder with major contributions from predisposing genetic factors, which interact with the overall genetic background and/or environmental insults to determine the phenotype.

The ability to identify the genetic factors that increase the risk for autism would be a breakthrough for genetic counseling for prevention of the disorder. In addition, it would allow the creation of genetically-engineered animals in which to study the environmental factors that interact with the inherited predispositions. Tests for genetic factors would also serve as biomarkers, valuable for diagnosis, and useful in research on all aspects of the autism spectrum. Unfortunately, neither linkage nor association studies have revealed any chromosomal regions strongly related to autism (e.g. Spence, M. A. et al., Behav. Genet., 15, 1 (1985); Smalley, S. L. et al., Arch. Gen. Psychiat., 45, 953 (1988); Cook, E. H. et al., Molec. Psychiat., 2, 247 (1997); Klauck, S. M. et al., Hum. Molec. Genet., 6, 2233 (1997); Cook, E. H. et al., Am. J. Hum. Genet., 62, 1077 (1998)).

Furthermore, while there is no known medical treatment for autism, some success has been reported for early intervention with behavioral therapies. A biomarker would allow identification of the disease, now typically diagnosed between ages three and five, in infancy or prenatal life. Thus, there is an urgent need for a method of reliably identifying subjects with autism. In particular there is need for a blood test for polymorphisms causing autism spectrum disorders.
Families with affected members need to know whether they carry a mutation which could affect future pregnancies. Clinicians need a test as an aid in diagnosis, and researchers would use the test to classify subjects according to the etiology of their disease.

SUMMARY OF THE INVENTION

The present invention relates to a method for screening subjects for genetic markers associated with autism. A biological sample is isolated from a mammal and then tested for the presence of a mutated gene or a product thereof which is associated with autism.

Another aspect of the invention is an isolated nucleic acid encoding a HoxA1 allele having a polymorphism which is associated with autism spectrum disorders.

Yet another aspect of the invention is an isolated nucleic acid encoding a HoxB1 allele having a polymorphism which is associated with autism spectrum disorders.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows two different alleles of HoxA1 from a case of autism spectrum disorder. FIG. 1A shows the previously published sequence of wild-type HoxA1. FIG. 1B shows a previously unknown polymorphism having a single base substitution at position 218, where an A is changed to a G.

FIG. 2 shows a second polymorphism, identified in the first exon of HoxB1. The published sequence of wild-type HoxB1 (FIG. 2A) is compared to the previously unknown polymorphism in this paralog of HoxA1 (FIG. 2B). In this case, the anomaly is a nine-base insertion that adds a third repeat where two are normally present. The result is three extra amino acids (serine-alanine-histidine).

For each of the polymorphisms, it was possible to test for the presence of the allele different from the known sequence by digesting PCR product with a restriction enzyme (Hph-I for HoxA1 and Msp-I for HoxB1). Sequencing reactions were carried out on 30-40 subjects to be certain that the digestion results match the sequencing results, demonstrating that the digestion procedure detects the deviant sequence described and no other.

DETAILED DESCRIPTION OF THE INVENTION

The present invention provides a method for screening subjects for genetic markers associated with autism. A biological sample is isolated from a mammal and then tested for the presence of a mutated gene or a product thereof which is associated with autism.

Polymorphisms in Hox genes are shown to be associated with autism spectrum disorders. The Hox genes are a family of genes that function in the patterning of body structures that develop along an anteroposterior axis, such as the limbs, skeleton, and nervous system; they are expressed during embryonic development at specific times in limited regions of the embryo. In the mouse, for example, Hox-a1 is expressed in rhombomeres 4 through 8 of the developing hindbrain on days 8 to 8.5 of gestation. The Hox genes control the pattern formation of the hindbrain. Similar abnormalities have been observed in the brains of autistic individuals (Rodier et al., J. Comp. Neurol., 370, 247 (1996), which is hereby incorporated by reference).

The DNA and amino acid sequences for HoxA1 have previously been reported (Acampora, D. et al., Nucleic Acids Res., 17, 10385 (1989); Hong, Y. et al., Gene, 159, 209 (1995), which are hereby incorporated by reference). Exon 1 stretches from base 1 to base 357. Exon 2 stretches from base 358 to the end (1008). The wild-type gene sequence for HoxA1 is provided in SEQ. ID. No. 1 as follows:

ATGGACAATG CAAGAATGAA CTCCTTCCTG GAATACCCCA TACTTAGCAG TGGCGACTCG 60

GGGACCTGCT CAGCCCGAGC CTACCCCTCG GACCATAGGA TTACAACTTT CCAGTCGTGC 120

GCGGTCAGCG CCAACAGTTG CGGCGGCGAC GACCGCTTCC TAGTGGGCAG GGGGGTGCAG 180

ATCGGTTCGC CCCACCACCA CCACCACCAC CACCATCACC ACCCCCAGCC GGCTACCTAC 240

CAGACTTCCG GGAACCTGGG GGTGTCCTAC TCCCACTCAA GTTGTGGTCC AAGCTATGGC 300

TCACAGAACT TCAGTGCGCC TTACAGCCCC TACGCGTTAA ATCAGGAAGC AGACGTAAGT 360

GGTGGGTACC CCCAGTGCGC TCCCGCTGTT TACTCTGGAA ATCTCTCATC TCCCATGGTC 420

CAGCATCACC ACCACCACCA GGGTTATGCT GGGGGCGCGG TGGGCTCGCC TCAATACATT 480

CACCACTCAT ATGGACAGGA GCACCAGAGC CTGGCCCTGG CTACGTATAA TAACTCCTTG 540

TCCCCTCTCC ACGCCAGCCA CCAAGAAGCC TGTCGCTCCC CCGCATCGGA GACATCTTCT 600

CCAGCGCAGA CTTTTGACTG GATGAAAGTC AAAAGAAACC CTCCCAAAAC AGGGAAAGTT 660

GGAGAGTACG GCTACCTGGG TCAACCCAAC GCGGTGCGCA CCAACTTCAC TACCAAGCAG 720

CTCACGGAAC TGGAGAAGGA GTTCCACTTC AACAAGTACC TGACGCGCGC CCGCAGGGTG 780

GAGATCGCTG CATCCCTGCA GCTCAACGAG ACCCAAGTGA AGATCTGGTT CCAGAACCGC 840

CGAATGAAGC AAAAGAAACG TGAGAAGGAG GGTCTCTTGC CCATCTCTCC GGCCACCCCG 900

CCAGGAAACG ACGAGAAGGC CGAGGAATCC TCAGAGAAGT CCAGCTCTTC GCCCTGCGTT 960

CCTTCCCCGG GGTCTTCTAC CTCAGACACT CTGACTACCT CCCACTGA 1008

The nucleic acid molecule of SEQ. ID. No. 1 encodes a polypeptide having the amino acid sequence of SEQ. ID. No. 2, as follows:
A polymorphism in the HoxA1 gene has been isolated and
sequenced. This polymorphism is associated with autism
spectrum disorders. A single base substitution is located at
position 218 (underlined) of SEQ. ID. No. 3, where an A is
changed to a G, as follows:



The single base substitution at position 218 results in the 
replacement of histidine with arginine (underlined). The 
resulting protein has the amino acid sequence (SEQ. ID. No. 
4) as follows: 



ATGGACAATG CAAGAATGAA CTCCTTCCTG GAATACCCCA TACTTAGCAG TGGCGACTCG 60

GGGACCTGCT CAGCCCGAGC CTACCCCTCG GACCATAGGA TTACAACTTT CCAGTCGTGC 120 

GCGGTCAGCG CCAACAGTTG CGGCGGCGAC GACCGCTTCC TAGTGGGCAG GGGGGTGCAG 180 

ATCGGTTCGC CCCACCACCA CCACCACCAC CACCATCGCC ACCCCCAGCC GGCTACCTAC 240

CAGACTTCCG GGAACCTGGG GGTGTCCTAC TCCCACTCAA GTTGTGGTCC AAGCTATGGC 300 

TCACAGAACT TCAGTGCGCC TTACAGCCCC TACGCGTTAA ATCAGGAAGC AGACGTAAGT 360

GGTGGGTACC CCCAGTGCGC TCCCGCTGTT TACTCTGGAA ATCTCTCATC TCCCATGGTC 420

CAGCATCACC ACCACCACCA GGGTTATGCT GGGGGCGCGG TGGGCTCGCC TCAATACATT 480 

CACCACTCAT ATGGACAGGA GCACCAGAGC CTGGCCCTGG CTACGTATAA TAACTCCTTG 540 

TCCCCTCTCC ACGCCAGCCA CCAAGAAGCC TGTCGCTCCC CCGCATCGGA GACATCTTCT 600

CCAGCGCAGA CTTTTGACTG GATGAAAGTC AAAAGAAACC CTCCCAAAAC AGGGAAAGTT 660 

GGAGAGTACG GCTACCTGGG TCAACCCAAC GCGGTGCGCA CCAACTTCAC TACCAAGCAG 720

CTCACGGAAC TGGAGAAGGA GTTCCACTTC AACAAGTACC TGACGCGCGC CCGCAGGGTG 780 

GAGATCGCTG CATCCCTGCA GCTCAACGAG ACCCAAGTGA AGATCTGGTT CCAGAACCGC 840 

CGAATGAAGC AAAAGAAACG TGAGAAGGAG GGTCTCTTGC CCATCTCTCC GGCCACCCCG 900 

CCAGGAAACG ACGAGAAGGC CGAGGAATCC TCAGAGAAGT CCAGCTCTTC GCCCTGCGTT 960 

CCTTCCCCGG GGTCTTCTAC CTCAGACACT CTGACTACCT CCCACTGA 1008 
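
The effect of the substitution at position 218 can be checked with a short script. The following is a minimal sketch and not part of the original disclosure: it copies bases 181 to 240 of the wild-type HoxA1 listing (SEQ. ID. No. 1), changes base 218 from A to G, and translates the affected codon with the standard genetic code. The helper names (`substitute`, `codon_at`, `count_sites`) are illustrative only. The HphI motif used here (GGTGA, read as TCACC on the opposite strand) is the commonly listed recognition sequence and is an assumption, since the text names the enzyme but does not state its site.

```python
# Standard genetic code in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Bases 181-240 of SEQ. ID. No. 1 (wild-type HoxA1), copied from the listing.
SEGMENT_START = 181
WILD_TYPE = (
    "ATCGGTTCGC" "CCCACCACCA" "CCACCACCAC"
    "CACCATCACC" "ACCCCCAGCC" "GGCTACCTAC"
)

# Assumed HphI recognition motif (not stated in the text).
HPHI_MOTIF = "GGTGA"


def substitute(seq, position, new_base, start=SEGMENT_START):
    """Return seq with the base at the 1-based gene position replaced."""
    i = position - start
    return seq[:i] + new_base + seq[i + 1:]


def codon_at(seq, position, start=SEGMENT_START):
    """Return the codon containing the 1-based gene position.

    The reading frame is taken to start at base 1 of the gene.
    """
    codon_start = position - (position - 1) % 3
    i = codon_start - start
    return seq[i:i + 3]


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def count_sites(seq, motif=HPHI_MOTIF):
    """Count occurrences of the motif on either strand."""
    return seq.count(motif) + seq.count(reverse_complement(motif))


variant = substitute(WILD_TYPE, 218, "G")
wt_codon = codon_at(WILD_TYPE, 218)
var_codon = codon_at(variant, 218)

print(wt_codon, CODON_TABLE[wt_codon], "->", var_codon, CODON_TABLE[var_codon])
print("HphI motifs in window:", count_sites(WILD_TYPE), "->", count_sites(variant))
```

Under these assumptions the codon containing base 218 changes from a histidine codon to an arginine codon, matching the replacement described in the text, and the number of assumed HphI motifs in this 60-base window differs between the two alleles, which is the kind of difference a restriction digest of PCR product can reveal.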
In addition to the polymorphism in HoxA1, a polymorphism associated with autism spectrum disorders has been isolated and sequenced from the HoxB1 gene. The Hoxb1 gene has not been studied as comprehensively as Hoxa1 in transgenic knockouts, but is expressed at the same stage (Murphy, P. et al., Development, 111, 61 (1991), which is hereby incorporated by reference). Its null mutation produces similar malformations, including severe diminution of the facial nucleus (Goddard, J. M. et al., Development, 122, 3217 (1996), which is hereby incorporated by reference). The similarity of expression and function of these two genes is due to the fact that they were originally a single gene in invertebrates (Ruddle, F. H. et al., Annu. Rev. Genet., 28, 423 (1993), which is hereby incorporated by reference). In mammals, the two appear on separate chromosomes (human 7 and 17), but the sequence of each of the mammalian genes is similar to the others, and similar to the original single gene from which the two mammalian loci arose. The sequence of the wild-type HoxB1 gene (SEQ. ID. No. 5) follows:



TGACGCATGG ACTATAATAG GATGAACTCC TTCTTAGAGT ACCCACTCTG TAACCGGGGA 60 

CCCAGCGCCT ACAGCGCCCA CAGCGCCCCA ACCTCCTTTC CCCCAAGCTC GGCTCAGGCG 120 

GTTGACAGCT ATGCAAGCGA GGGCCGCTAC GGTGGGGGGC TGTCCAGCCC TGCGTTTCAG 180 

CAGAACTCCG GCTATCCCGC CCAGCAGCCG CCTTCGACCC TGGGGGTGCC CTTCCCCAGC 240 

TCCGCGCCCT CGGGGTATGC TCCTGCCGCC TGCAGCCCCA GCTACGGGCC TTCTCAGTAC 300 

TACCCTCTGG GTCAATCAGA AGGAGACGGA GGCTATTTTC ATCCCTCGAG CTACGGGGCC 360

CAGCTAGGGG GCTTGTCCGA TGGCTACGGA GCAGGTGGAG CCGGTCCGGG GCCATATCCT 420 

CCGCAGCATC CCCCTTATGG GAACGAGCAG ACCGCGAGCT TTGGACCGGC CTATGCTGAT 480 

CTCCTCTCCG AGGACAAGGA AACACCCTGC CCTTCAGAAC CTAACACCCC CACGGCCCGG 540 

ACCTTCGACT GGATGAAGGT TAAGAGAAAC CCACCCAAGA CAGCGAAGGT GTCAGAGCCA 600 

GGCCTGGGCT CGCCCAGTGG CCTCCGCACC AACTTCACCA CAAGGCAGCT GACAGAACTG 660 

GAAAAGGAGT TCCATTTCAA CAAGTACCTG AGCCGGGCCC GGAGGGTGGA GATTGCCGCC 720 

ACCCTGGAGC TCAATGAAAC ACAGGTCAAG ATTTGGTTCC AGAACCGACG AATGAAGCAG 780

AAGAAGCGCG AGCGAGAGGG AGGTCGGGTC CCCCCAGCCC CACCAGGCTG CCCCAAGGAG 840 

GCAGGTGGAG ATGCCTCAGA CCAGTCGACA TGCACCTCCC CGGAAGCCTC ACCCAGCTCT 900 

GTCACCTCCT GAACTGAACC TAGCCACCAA TGGGGCTTCC AGGCACTGGA GCGCCCCAGT 960 

CCAGCCCTAT CCCAGGCTCT CCCAACCCAG GCCTGGCTTC ACTGCCTGGG ATCTCTAGGC 1020 

T 1021 
The protein encoded by nucleotides 7 to 909 of the wild-type HoxB1 gene (SEQ. ID. No. 6) is as follows:

As with the HoxA1 gene, polymorphisms associated with autism spectrum disorders were found with HoxB1. The HoxB1 mutation occurs after base 88 (C) with the insertion of nine nucleotides (ACAGCGCCC). The location of this insertion is such that the amino acid sequence also changes. The normal sequence reads . . . serine-alanine-histidine-serine-alanine-proline. The mutant sequence has an extra serine-alanine-histidine sequence and then the sequence resumes normally. The insertion and altered amino acid sequence are underlined below. A mutated form of HoxB1 (SEQ. ID. No. 7) is depicted as follows:



TGACGCATGG ACTATAATAG GATGAACTCC TTCTTAGAGT ACCCACTCTG TAACCGGGGA 60 

CCCAGCGCCT ACAGCGCCCA CAGCGCCCAC AGCGCCCCAA CCTCCTTTCC CCCAAGCTCG 120

GCTCAGGCGG TTGACAGCTA TGCAAGCGAG GGCCGCTACG GTGGGGGGCT GTCCAGCCCT 180 

GCGTTTCAGC AGAACTCCGG CTATCCCGCC CAGCAGCCGC CTTCGACCCT GGGGGTGCCC 240 

TTCCCCAGCT CCGCGCCCTC GGGGTATGCT CCTGCCGCCT GCAGCCCCAG CTACGGGCCT 300 

TCTCAGTACT ACCCTCTGGG TCAATCAGAA GGAGACGGAG GCTATTTTCA TCCCTCGAGC 360

TACGGGGCCC AGCTAGGGGG CTTGTCCGAT GGCTACGGAG CAGGTGGAGC CGGTCCGGGG 420 

CCATATCCTC CGCAGCATCC CCCTTATGGG AACGAGCAGA CCGCGAGCTT TGCACCGGCC 480 

TATGCTGATC TCCTCTCCGA GGACAAGGAA ACACCCTGCC CTTCAGAACC TAACACCCCC 540 

ACGGCCCGGA CCTTCGACTG GATGAAGGTT AAGAGAAACC CACCCAAGAC AGCGAAGGTG 600

TCAGAGCCAG GCCTGGGCTC GCCCAGTGGC CTCCGCACCA ACTTCACCAC AAGGCAGCTG 660 

ACAGAACTGG AAAAGGAGTT CCATTTCAAC AAGTACCTGA GCCGGGCCCG GAGGGTGGAG 720 

ATTGCCGCCA CCCTGGAGCT CAATGAAACA CAGGTCAAGA TTTGGTTCCA GAACCGACGA 780 

ATGAAGCAGA AGAAGCGCGA GCGAGAGGGA GGTCGGGTCC CCCCAGCCCC ACCAGGCTGC 840

CCCAAGGAGG CAGCTGGAGA TGCCTCAGAC CAGTCGACAT GCACCTCCCC GGAAGCCTCA 900 

CCCAGCTCTG TCACCTCCTG AACTGAACCT AGCCACCAAT GGGGCTTCCA GGCACTGGAG 960 

CGCCCCAGTC CAGCCCTATC CCAGGCTCTC CCAACCCAGG CCTGGCTTCA CTGCCTGGGA 1020 

TCTCTAGGCT 1030 
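
The nine-base insertion can be reproduced in the same way. The sketch below is illustrative only and is not taken from the original disclosure: it copies the first 120 bases of the wild-type HoxB1 listing (SEQ. ID. No. 5), inserts ACAGCGCCC after base 88, and translates both versions starting at nucleotide 7, the start of the coding region stated in the text. The function and variable names are hypothetical.

```python
# Standard genetic code in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# First 120 bases of SEQ. ID. No. 5 (wild-type HoxB1), copied from the listing.
WILD_TYPE = (
    "TGACGCATGG" "ACTATAATAG" "GATGAACTCC" "TTCTTAGAGT" "ACCCACTCTG" "TAACCGGGGA"
    "CCCAGCGCCT" "ACAGCGCCCA" "CAGCGCCCCA" "ACCTCCTTTC" "CCCCAAGCTC" "GGCTCAGGCG"
)

INSERTION = "ACAGCGCCC"   # nine nucleotides named in the text
INSERT_AFTER = 88         # 1-based position of the base preceding the insertion
CODING_START = 7          # 1-based position of the first coding nucleotide


def insert_after(seq, position, fragment):
    """Insert fragment immediately after the 1-based position."""
    return seq[:position] + fragment + seq[position:]


def translate(seq, start=CODING_START):
    """Translate complete codons from the 1-based start position onward."""
    first = start - 1
    codons = (seq[i:i + 3] for i in range(first, len(seq) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


mutant = insert_after(WILD_TYPE, INSERT_AFTER, INSERTION)
wild_protein = translate(WILD_TYPE)
mutant_protein = translate(mutant)

print(wild_protein)
print(mutant_protein)
```

Because the insertion is a multiple of three bases, the reading frame is preserved: the translated fragment gains one extra serine-alanine-histidine repeat and then continues as in the wild-type sequence, consistent with the description above.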
The protein encoded by SEQ. ID. No. 8 is as follows:
Genes which have been duplicated and then maintained similar functions over the course of evolution are called "paralogs." A third paralog derived from the same invertebrate gene is known as HoxD1. This gene has not yet been studied in knockouts, but is known to have evolved to be expressed in somewhat different embryonic tissues (mesoderm vs. ectoderm) in the hindbrain region at the same stage of development as Hoxa1 and Hoxb1. Thus preferred Hox genes include HoxA1, HoxB1, and HoxD1.

Biological samples suitable for testing include blood, 
saliva, amniotic fluid, and tissue. The most preferred bio- 
logical sample is blood. However, any biological sample 
from which genetic material or the products of the marker 
genes can be isolated is suitable. 

Because the Hox genes are highly conserved among 
species, the present invention is applicable for screening for 
autism related polymorphisms in mammals. The screening 
method can be utilized to identify animals carrying defects 
in genes like those which give rise to autism in humans in 
order to study the progression of the disease and test 
treatments. However, the preferred mammal to be screened 
is humans. In particular, the biological samples are isolated 
from developmentally disabled children or adults in order to 
determine whether they carry the marker associated with 
autism to assist in diagnosing the disease. Similarly, the 
parents or relatives of disabled children may be screened to 
determine whether they are carriers of the mutated gene. 
Samples may also be tested from children including infants 
to identify those children who have genetic markers asso- 
ciated with autism in order to provide them with early 
behavior training. 

As discussed more fully in the examples, polymorphisms in the HoxA1 gene are associated with autism spectrum disorders. In addition to HoxA1, the HoxB1 and HoxD1 genes are also involved in the same stages of early brain development. Hoxb1 and Hoxd1 are related developmental genes which are expressed at the same time and in approximately the same region of the embryo as Hoxa1. The Hox genes are closely related and may perform similar functions in development. Evolutionarily the various Hox genes were probably derived from a common ancestral gene. Thus, the preferred genes to be screened include Hoxa1, Hoxb1, and Hoxd1.



The mutation in the mutated gene may be a single base substitution mutation resulting in an amino acid substitution, a single base substitution mutation resulting in a translational stop, an insertion mutation, a deletion mutation, or a gene rearrangement. As demonstrated from the identified polymorphisms in HoxA1 and HoxB1, polymorphisms which disrupt the gene or result in an altered peptide are associated with autism spectrum disorders.

The mutation may be located in an intron, an exon of the gene, or a promoter or other regulatory region which affects the expression of the gene.

Methods for screening for mutated nucleic acids include direct sequencing of nucleic acids, single strand polymorphism assay, ligase chain reaction, enzymatic cleavage, and Southern hybridization.

Screening for mutated nucleic acids can be accomplished by direct sequencing of nucleic acids. In fact, putative mutants identified by other methods may be sequenced to determine the exact nature of the mutation. Nucleic acid sequences can be determined through a number of different techniques which are well known to those skilled in the art. In order to sequence the nucleic acid, sufficient copies of the material must first be amplified.

Amplification of a selected, or target, nucleic acid sequence may be carried out by any suitable means. (See generally Kwoh, D. and Kwoh, T., Am Biotechnol Lab, 8, 14 (1990), which is hereby incorporated by reference.) Examples of suitable amplification techniques include, but are not limited to, polymerase chain reaction, ligase chain reaction (see Barany, Proc Natl Acad Sci USA 88, 189 (1991), which is hereby incorporated by reference), strand displacement amplification (see generally Walker, G. et al., Nucleic Acids Res. 20, 1691 (1992); Walker, G. et al., Proc Natl Acad Sci USA 89, 392 (1992), which are hereby incorporated by reference), transcription-based amplification (see Kwoh, D. et al., Proc Natl Acad Sci USA, 86, 1173 (1989), which is hereby incorporated by reference), self-sustained sequence replication (or "3SR") (see Guatelli, J. et al., Proc Natl Acad Sci USA, 87, 1874 (1990), which is hereby incorporated by reference), the Qβ replicase system (see Lizardi, P. et al., Biotechnology, 6, 1197 (1988), which is hereby incorporated by reference), nucleic acid sequence-based amplification (or "NASBA") (see Lewis, R., Genetic
Engineering News, 12(9), 1 (1992), which is hereby incorporated
by reference), the repair chain reaction (or "RCR")
(see Lewis, R., Genetic Engineering News, 12(9), 1 (1992),
which is hereby incorporated by reference), and boomerang
DNA amplification (or "BDA") (see Lewis, R., Genetic
Engineering News, 12(9), 1 (1992), which is hereby incorporated
by reference). Polymerase chain reaction is currently
preferred.

In general, DNA amplification techniques such as the
foregoing involve the use of a probe, a pair of probes, or two
pairs of probes which specifically bind to DNA encoding the
gene of interest, but do not bind to DNA which does not
encode the gene, under the same hybridization conditions,
and which serve as the primer or primers for the amplification
of the gene of interest or a portion thereof in the
amplification reaction.

Nucleic acid sequencing can be performed by chemical or
enzymatic methods. The enzymatic method relies on the
ability of DNA polymerase to extend a primer, hybridized to
the template to be sequenced, until a chain-terminating
nucleotide is incorporated. The most common methods
utilize dideoxynucleotides. Primers may be labelled with
radioactive or fluorescent labels. Various DNA polymerases
are available including Klenow fragment, AMV reverse
transcriptase, Thermus aquaticus DNA polymerase, and
modified T7 polymerase.

Although DNA sequencing is clearly the most sensitive
and informative method, it is too cumbersome for routine
use in searching for polymorphisms, especially when the
DNA segment of interest is large. Several other methods are
available for a rapid search for changes in autism associated
genes.

Recently, single strand polymorphism assay ("SSPA")
analysis and the closely related heteroduplex analysis methods
have come into use as effective methods for screening
for single-base polymorphisms (Orita, M. et al., Proc Natl
Acad Sci USA, 86, 2766 (1989), which is hereby incorporated
by reference). In these methods, the mobility of
PCR-amplified test DNA from clinical specimens is compared
with the mobility of DNA amplified from normal
sources by direct electrophoresis of samples in adjacent
lanes of native polyacrylamide or other types of matrix gels.
Single-base changes often alter the secondary structure of
the molecule sufficiently to cause slight mobility differences
between the normal and mutant PCR products after prolonged
electrophoresis.

Ligase chain reaction is yet another recently developed
method of screening for mutated nucleic acids. Ligase chain
reaction (LCR) is also carried out in accordance with known
techniques. LCR is especially useful to amplify, and thereby
detect, single nucleotide differences between two DNA
samples. In general, the reaction is carried out with two pairs
of oligonucleotide probes: one pair binds to one strand of the
sequence to be detected; the other pair binds to the other
strand of the sequence to be detected. The reaction is carried
out by, first, denaturing (e.g., separating) the strands of the
sequence to be detected, then reacting the strands with the
two pairs of oligonucleotide probes in the presence of a heat
stable ligase so that each pair of oligonucleotide probes
hybridizes to target DNA and, if there is perfect complementarity
at their junction, adjacent probes are ligated together.
The hybridized molecules are then separated under denaturation
conditions. The process is cyclically repeated until the
sequence has been amplified to the desired degree. Detection
may then be carried out in a manner like that described
above with respect to PCR.

Southern hybridization is also an effective method of
identifying differences in sequences. Hybridization
conditions, such as salt concentration and temperature, can
be adjusted for the sequence to be screened. Southern
blotting and hybridization protocols are described in Current
Protocols in Molecular Biology (Greene Publishing
Associates and Wiley-Interscience), pages 2.9.1-2.9.10.
Probes can be labelled for hybridization with random oligomers
(primarily 9-mers) and the Klenow fragment of
DNA polymerase. Very high specific activity probe can be
obtained using commercially available kits such as the
Ready-To-Go DNA Labelling Beads (Pharmacia Biotech),
following the manufacturer's protocol. Briefly, the
DNA (probe) is labelled with 32P-dCTP in a 15 minute
incubation at 37° C. Labelled probe is then purified over a
ChromaSpin (Clontech) nucleic acid purification column.
Possible competition of probes having high repeat sequence
content, and stringency of hybridization and washdown, will
be determined individually for each probe used.
Alternatively, fragments of a candidate gene may be generated
by PCR, the specificity may be verified using a rodent-human
somatic cell hybrid panel, and subcloning the fragment
allows for a large prep for sequencing and use as
a probe. Once a given gene fragment has been characterized,
small probe preps can be done by gel- or column-purifying
the PCR product.

These mismatch detection protocols use samples generated
by PCR and thus require use of very little genomic
template. All of these methods can provide very good clues
regarding the location of the sequence change which leads to
the appearance of anomalous bands, hence facilitating subsequent
cloning and sequencing strategies.

Methods of screening for mutated nucleic acids can be
carried out using either deoxyribonucleic acids ("DNA") or
messenger ribonucleic acids ("mRNA") isolated from the
biological sample. During periods when the gene is
expressed, mRNA may be abundant and more readily
detected. However, these genes are temporally controlled
and, at most stages of development, the preferred material
for screening is DNA.

Alternatively, the detection of a mutated gene associated
with autism can be carried out by collecting a biological
sample and testing for the presence or form of the protein
produced by the gene. The mutation in the gene may result
in the production of a mutated form of the peptide or the lack
of production of the gene product. In this embodiment, the
determination of the presence of the polymorphic form of
the protein can be carried out, for example, by isoelectric
focusing, protein sizing, or immunoassay. In an
immunoassay, an antibody that selectively binds to the
mutated protein can be utilized (for example, an antibody
that selectively binds to the mutated form of HoxA1
encoded protein). Such methods for isoelectric focusing and
immunoassay are well known in the art, and are discussed in
further detail below.

Changes in the size or charge of the polypeptide can be
identified by isoelectric focusing or protein sizing techniques.
Changes resulting in amino acid substitutions, where
the substituted amino acid has a different charge than the
original amino acid, can be detected by isoelectric focusing.
Isoelectric focusing of the polypeptide through a gel having
an ampholine gradient at high voltages separates proteins by
their pI. The pH gradient gel can be compared to a simultaneously
run gel containing the wild-type protein. Protein
sizing techniques such as protein electrophoresis and sizing
chromatography can also be used to detect changes in the
size of the product.
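The effect of a charge-changing substitution on isoelectric focusing can be sketched numerically. The example below is a rough illustration only: it applies the Henderson-Hasselbalch relation with approximate textbook pKa values (assumptions, not values from this specification) to a short fragment corresponding to residues 65-80 of SEQ ID NO:2, with and without a histidine-to-arginine substitution at residue 73. A fragment's calculated pI is not the pI of the full-length protein.

```python
# Approximate pKa values; these are textbook-style assumptions and vary by source.
POSITIVE_PKA = {"K": 10.5, "R": 12.5, "H": 6.0}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}
N_TERM_PKA = 9.0
C_TERM_PKA = 2.3


def net_charge(peptide, ph):
    """Estimate the net charge of a peptide (one-letter codes) at a given pH."""
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERM_PKA))    # protonated N-terminus
    charge -= 1.0 / (1.0 + 10 ** (C_TERM_PKA - ph))   # deprotonated C-terminus
    for residue in peptide:
        if residue in POSITIVE_PKA:
            charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[residue]))
        elif residue in NEGATIVE_PKA:
            charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[residue] - ph))
    return charge


def isoelectric_point(peptide, low=0.0, high=14.0, tolerance=1e-4):
    """Find the pH of zero net charge by bisection (charge falls as pH rises)."""
    while high - low > tolerance:
        middle = (low + high) / 2.0
        if net_charge(peptide, middle) > 0:
            low = middle
        else:
            high = middle
    return (low + high) / 2.0


# Residues 65-80 of SEQ ID NO:2 in one-letter code; residue 73 is index 8.
WILD_TYPE_FRAGMENT = "HHHHHHHHHHPQPATY"
VARIANT_FRAGMENT = "HHHHHHHHRHPQPATY"

if __name__ == "__main__":
    for name, fragment in (("wild type", WILD_TYPE_FRAGMENT),
                           ("His73Arg", VARIANT_FRAGMENT)):
        print(name, round(net_charge(fragment, 7.0), 2),
              round(isoelectric_point(fragment), 2))
```

Under these assumptions the arginine variant carries roughly one additional positive charge at neutral pH and focuses at a higher pH than the wild-type fragment, which is the kind of shift the comparison gel is meant to reveal.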
As an alternative to isoelectric focusing or protein sizing,
the step of determining the presence of the mutated polypeptides
in a sample may be carried out by an antibody assay
with an antibody which selectively binds to the mutated
polypeptides (i.e., an antibody which binds to the mutated
polypeptides but exhibits essentially no binding to the
wild-type polypeptide without the polymorphism in the
same binding conditions).

Antibodies used to bind selectively the products of the
mutated genes can be produced by any suitable technique.
For example, monoclonal antibodies may be produced in a
hybridoma cell line according to the techniques of Kohler
and Milstein, Nature, 256, 495 (1975), which is hereby
incorporated by reference. A hybridoma is an immortalized
cell line which is capable of secreting a specific monoclonal
antibody. The mutated products of genes which are associated
with autism may be obtained from a human patient,
purified, and used as the immunogen for the production of
monoclonal or polyclonal antibodies. Purified polypeptides
may be produced by recombinant means to express a biologically
active isoform, or even an immunogenic fragment
thereof may be used as an immunogen. Monoclonal Fab
fragments may be produced in Escherichia coli from the
known sequences by recombinant techniques known to
those skilled in the art. (See, e.g., Huse, W., Science, 246,
1275 (1989), which is hereby incorporated by reference)
(recombinant Fab techniques).

The term "antibodies" as used herein refers to all types of
immunoglobulin, including IgG, IgM, IgA, IgD, and IgE.
The antibodies may be monoclonal or polyclonal and may
be of any species of origin, including (for example) mouse,
rat, rabbit, horse, or human, or may be chimeric antibodies,
and include antibody fragments such as, for example, Fab,
F(ab')2, and Fv fragments, and the corresponding fragments
obtained from antibodies other than IgG.

Antibody assays may, in general, be homogeneous assays
or heterogeneous. In a homogeneous assay the immunological
reaction usually involves the specific antibody, a labeled
analyte, and the sample of interest. The signal arising from
the label is modified, directly or indirectly, upon the binding
of the antibody to the labeled analyte. Both the immunological
reaction and detection of the extent thereof are
carried out in a homogeneous solution. Immunochemical
labels which may be employed include free radicals,
radioisotopes, fluorescent dyes, enzymes, bacteriophages,
coenzymes, and so forth.

In a heterogeneous assay approach, the reagents are
usually the specimen, the antibody of the invention and
means for producing a detectable signal. Similar specimens
as described above may be used. The antibody is generally
immobilized on a support, such as a bead, plate, or slide, and
contacted with the specimen suspected of containing the
antigen in a liquid phase. The support is then separated from
the liquid phase and either the support phase or the liquid
phase is examined for a detectable signal employing means
for producing such signal. The signal is related to the
presence of the analyte in the specimen. Means for producing
a detectable signal include the use of radioactive labels,
fluorescent labels, enzyme labels, and so forth. For example,
if the antigen to be detected contains a second binding site,
an antibody which binds to that site can be conjugated to a
detectable group and added to the liquid phase reaction
solution before the separation step. The presence of the
detectable group on the solid support indicates the presence
of the antigen in the test sample. Examples of suitable
immunoassays are the radioimmunoassay, immunofluorescence
methods, enzyme-linked immunoassays, and the like.

Those skilled in the art will be familiar with numerous
specific immunoassay formats and variations thereof which
may be useful for carrying out the method disclosed herein.
See U.S. Pat. Nos. 4,727,022, 4,659,678, 4,376,110,
4,275,149, 4,233,402, and 4,230,767.

Antibodies which selectively bind a polymorphic DLST
isoform may be conjugated to a solid support suitable for a
diagnostic assay (e.g., beads, plates, slides or wells formed
from materials such as latex or polystyrene) in accordance
with known techniques, such as precipitation. Antibodies
which bind a polymorphic DLST isoform may likewise be
conjugated to detectable groups such as radiolabels (e.g.,
[illegible]), enzyme labels (e.g., horseradish peroxidase,
alkaline phosphatase), and fluorescent labels (e.g.,
fluorescein) in accordance with known techniques.

The invention further provides an isolated nucleic acid
molecule which encodes a HoxA1 gene having a single base
substitution at nucleotide 218 in SEQ. ID. No. 1. In another
embodiment, the invention provides an isolated nucleic acid
molecule which encodes a HoxB1 gene having an insertion
between nucleotide positions 88 and 89 in SEQ. ID. No. 5.
In addition, the invention provides fragments of the HoxA1
and HoxB1 genes having the polymorphism, where the
fragment has at least 15 nucleotides and encompasses the
polymorphism, i.e., the single base substitution. Fragments
longer than 15 nucleotides can be used to probe for nucleic
acid molecules containing the polymorphism. Longer fragments
may be used at higher stringency conditions.

The invention also provides isolated polypeptides that are
encoded by the genes having the polymorphisms. Either the
whole protein or fragments thereof may be used to induce
the production of antibodies specific to the portion of the
protein which is affected by the polymorphism. Such antibodies
may then be used to detect the presence of a polymorphism.
Preferred antibodies bind specifically to the
protein or polypeptide affected by the polymorphism but
with less affinity to the wild-type Hox protein.

In one embodiment, the antibody is a monoclonal antibody.
For use in an immunoassay, the antibody can be bound
to a solid support or bound to a detectable label.

EXAMPLES

Example 1

Collection of Blood Samples from Autistic Individuals

Blood was collected from patients with autism and their
immediate family members in order to determine whether
any polymorphisms in HoxA1 are present among this population.
All blood samples were procured following written
consent by the patients or their guardians. Among the
samples collected were those of the members of a family of
four in which one child has autism and the other has
Asperger's syndrome; both children have malformed ears.
The first son is retarded and the second has normal intelligence.
The parents have no obvious symptoms. DNA was
extracted from the blood by phenol-chloroform extraction
following isolation and lysis of the white blood cells.
Control DNA was also used for these experiments; this DNA
was obtained from neurologically normal donors.

The 20 cc blood samples were left for three to four days at
room temperature to allow continued proliferation of white
blood cells. White cells were pelleted, followed by isolation
of the nuclei. The nuclei were then incubated overnight at
37° C. in a lysis buffer consisting of EDTA, TNE-SDS, and
proteinase K. Protein contaminants were extracted by additions
of buffered phenol followed by chloroform, then DNA
was precipitated by the addition of ice-cold ethanol. The
DNA was resuspended in TE buffer for storage at 4° C.
Extraction of genomic DNA from fixed tissue was carried
out using the protocol of Volkenandt et al., Methods in
Molecular Biology, 15, 81, Humana Press, (1993), which is
hereby incorporated by reference.

Example 2

Sequencing the HoxA1 Gene

The HoxA1 gene was amplified by PCR from DNA
samples to provide sufficient material for sequencing. Two
sets of oligonucleotide primers were selected after examination
of the human HoxA1 nucleic acid sequence and
comparison of the sequence to those of human and mouse
Hox genes. The first set was designed to amplify residues
10-647, the second to amplify from residue 656 to the stop
codon at residue 1008, exons 1 and 2 of HoxA1, respectively.
The primers were used in polymerase chain reaction
to amplify the target gene in several control blood samples,
in order to determine the appropriate PCR conditions. Both
exons were amplified by 94° C. denaturation for 1 min, 62°
C. annealing for 30 sec, and 72° C. extension for 2 min, for
35 cycles. The products were visualized with ethidium
bromide staining on a 1-2% agarose gel. PhiX174 RF
DNA/Hae III fragments (Gibco) were used as a molecular
weight marker. The products were tested for chromosome
origin by using human-rodent monochromosomal somatic
cell hybrids. Both exons amplified by the HoxA1 primers
amplified from the hybrid containing human chromosome 7 and
do not amplify from any other hybrids. Establishing that the
product amplified by the primers is from the correct chromosome
rules out the possibility that pseudogenes with the
same sequence occur at other sites or that the amplified
product is another homologous homeobox gene. It verifies
that the PCR product represents only the targeted gene.

The polymerase chain reaction (PCR) was performed with
various samples of control DNA in order to determine the
appropriate conditions. Once the optimal conditions were
ascertained, the gene was amplified from the patient samples.

Following PCR, an aliquot of the product was used for
DNA sequencing using the Sequenase system version 2.0
(United States Biochemical), which is a chain-termination
method of DNA sequencing. The following procedure was
used to read the nucleic acid sequence of the amplified
products. 7 µl of PCR product was mixed with 2 µl shrimp
alkaline phosphatase and 0.5 µl exonuclease I. The mixture
was incubated at 37° C. for 15 min and then at 80° C. for 15
min. After addition of 1 µl of primer, the mixture was
incubated at 100° C. for 3 min and then chilled on ice for 5
min. Next, the sample was incubated for 5 min at room
temperature with the following additions: 2 µl 5x buffer, 1
µl DTT, 2 µl diluted dGTP, 0.5 µl 35S-dATP, and 2 µl diluted
Sequenase buffer. A 3.5 µl aliquot of the mixture was then
added to 1 µl of one dideoxyNTP. After 5 min at 37° C., 4
µl of stop solution was added to the tube. The products were
run on a 6% polyacrylamide sequencing gel for 2-4 hr.
Following this, the gel was dried on a BioRad gel dryer and
exposed to film overnight. Film was developed on a Kodak
M35A X-OMAT Processor. The method has been used
successfully to duplicate the published sequence of the
HoxA1 exons in samples from a number of controls. The film
was developed the next afternoon, and the DNA sequence
was read manually for comparison to the published HoxA1
sequence.

The nucleotide sequence from some patients, including
the members of the family mentioned previously, showed
the presence of two discrete bands at the same levels on the
gel.

Example 3

Sequencing the PCR Products

Since sequencing PCR products allows the DNA
sequence to be read from both alleles, a sequence with
double bands suggests heterozygosity, that is, that the two alleles
are not the same and that two different sequences superimposed
on one another are being read. Based on these results,
the PCR products were cloned in order to get a cleaner
sequence. Cloning separates the two alleles and allowed
each to be individually sequenced to determine whether one
or both alleles are abnormal.

The PCR products were cloned using Invitrogen's Zero
Blunt PCR Cloning Kit. This kit is designed to clone
blunt-ended PCR fragments, which can be generated by
using a thermostable DNA polymerase with proofreading
activity. Once the products were cloned, the clonal DNA was
sequenced using the Sequenase version 2.0 chain-termination
sequencing system. Each clone was sequenced
in both 5' and 3' directions, and the reactions were run out
for 6 hours on a 6% polyacrylamide sequencing gel.

Cloning allowed the determination that three out of four
members of this family are indeed heterozygous for HoxA1.
The father and both children contain an identical mutation
in the gene: a single base-pair change of A to G in the first
exon of the gene; the mother's gene is normal. This mutation
is dominant with variable penetrance. Sequences showing
the mutation can be seen in FIG. 1. FIG. 1A shows the
wild-type sequence. Substitution of guanine for adenine at
this single location as shown in FIG. 1B causes an alteration
in the resulting amino acid sequence, changing a histidine to
an arginine.

Example 4

The PCR products from this family were also subjected to
restriction enzyme digestion to confirm the mutation. The
normal HoxA1 sequence is cut by the restriction enzyme
Hph I; however, when mutated HoxA1
is digested, it will not be cut, because the recognition site has
been changed by the mutation. The enzyme has been
applied to PCR products from this family and confirms that the
mutation exists in the father and the children, but
not in the mother. The test was then used to digest PCR
products from approximately 100 controls, 36 parent pairs,
26 affected relatives, and [number illegible] probands. In forty cases, the
results of the restriction digest have been compared to those
from sequencing reactions. The two methods gave
identical results in every case.

Example 5

Sequencing of a Polymorphism in HoxB1

The sequence for the HoxB1 gene (accession number
X16666) was obtained from the Entrez data base. From this
sequence primers for the amplification of a 575 bp product
of exon 1 by PCR were designed (Sense:
5'-GCATGGACTATAATAGGATG-3' (SEQ. ID. No. 9);
Antisense: 5'-TCTTGGGTGGGTTTCTCTTA-3' (SEQ. ID.
No. 10)). The final concentrations of the following components
were used in the amplification reaction: 1.5 U Taq
polymerase; 200 µM each of dATP, dCTP, dGTP, dTTP; 1.5
mM MgCl2; 0.4 mM of each sense and antisense primer;
50-100 ng DNA template; and distilled H2O to a final
volume of 25 µl. The Taq, dNTPs and MgCl2 are supplied in
a Ready-To-Go PCR Bead (Pharmacia 27-9555-01) and
were used according to manufacturer's directions. The PCR
reaction was carried out in a Perkin-Elmer 480 GeneAmp or
a Perkin-Elmer 2400 thermocycler. Reaction conditions
were: denaturing for 1 minute at 94° C., and then 35 cycles
of denaturing at 94° C. for 45 sec, annealing at 57° C. for 45
sec, and elongation at 72° C. for 45 sec. Resulting PCR
product was analyzed on a 1% agarose gel and compared to
a 100 bp ladder to determine the size of the product. Since
the size of the product was as expected (575 bp) and somatic
cell hybrid results indicated that the product is specific for
chromosome 17, DNA samples from probands, family members
and controls were amplified and sequenced using a
radiolabeled terminator cycle sequencing kit (Amersham
Life Science US79750). The sequencing reaction was run on
a 6% acrylamide sequencing gel (National Diagnostics) and
exposed to Kodak Biomax MS X ray film for 24-48 hours.
After developing the film, the resulting sequence was compared
to the published sequence found in the Entrez data
base.

Example 6

Association of the Newly-discovered Alleles with Autism
Spectrum Disorders

Forty-six probands with autistic spectrum disorders and
evidence of genetic causation were selected for analysis.
Forty-three had one or more other affected family members
and thirty-five had ear anomalies or neurological deficits
consistent with malfunction of HoxA1 or its paralogs. For
comparison, three other groups were tested:

1) An unstructured control group consisting of adults with
no evidence of neurological abnormality collected from
many different medical centers. These were mostly spouses
of patients with late onset degenerative diseases of the
nervous system. The purpose of this group was to determine
the frequency of the alleles in the general population.

2) Parent controls: While each of the parents of a
proband obviously transmits half of his or her genetic
material to the proband, imaginary individuals with two
alleles constructed from the untransmitted allele of each
parent pair should give an accurate estimate of the frequency
of the alleles in the study population, aside from those
transmitted to the probands. Thus, the untransmitted alleles
of the parent pair make a more stringent control, taking into
account known and unknown structure in the local population.

3) Affected family members of probands: When they
were available, the siblings, cousins, parents, or aunts and
uncles of probands diagnosed with autism spectrum disorders
or related symptoms (e.g. learning disabilities, language
delays, neurological anomalies of the cranial nerves) were
tested. If an allele is associated with autism, it should be
more frequent in probands and affected family members
than in historic or parent controls.

TABLE 1

Percent of individuals with polymorphic forms of HoxA1 and/or B1

                                    HoxA1   HoxB1   HoxA1 or HoxB1
Historic controls (N = 101)         16      34      47
Parent controls (N = 36)            22      39      55†
Probands with ASD (N = 46)          35**    52*     80***
Other affected relatives (N = 24)   38*     42      75*

different from historical controls: * - p < .05, ** - p < .01, *** - p < .001
different from probands: † - p < .05

Table 1 demonstrates that parent controls are, indeed,
similar to historic controls in their rates of the polymorphisms
under study, while affected family members are
similar to probands. This is especially true when the two
functionally-related genes are combined. Eighty percent of
probands have one deviation from the normal sequence or
the other, while only 47% of historical controls have an
anomaly. Parent controls (untransmitted alleles) match the
historical controls in their rate of abnormal alleles, indicating
that the local population is not structured differently
from the general population in its rate of these alleles. In
contrast, both probands (χ²=14.83, p<0.001) and other
affected family members (χ²=6.30, p<0.02) differ significantly
from historical controls. The probands differ significantly
from the parent controls, as well (χ²=4.08, p<0.05).
The probands' genetic anomalies of HoxA1 or HoxB1
are concordant with the other affected members of the family
in 18/22 cases (χ²=17.82, p<0.001). Finally, both the HoxA1
and HoxB1 polymorphisms are significantly associated with
autism as judged by the Transmission Disequilibrium Test
for Association (Spielman and Ewens, 1996), which compares
the rate of transmission "into the disease" to the 50%
rate one would expect in offspring of parents with the allele
of interest. The χ² values for this test are: HoxA1=5.16, p<0.05;
HoxB1=4.67, p<0.05.

In addition to the living probands, it was of interest to
determine the genotype of the patient whose brain anatomy
first suggested the involvement of the Hox genes in autism
(Rodier et al., 1996). Genomic DNA was extracted from the
autopsy tissue, and the patient was determined to have the
B1 polymorphism (Stodgell et al., 1998).

One proband is homozygous for the less common allele of
HoxA1, and he is severely affected. He was diagnosed early,
at 21 months. None of the historic controls, and no parents,
were homozygous for the polymorphism. Homozygosity of
the HoxB1 polymorphism occurred in two historic controls,
one affected parent, and in two severely-affected probands.
Larger samples are needed to determine whether either
polymorphism reduces viability. Three probands have both
polymorphisms, and are severely disabled. The detection
and description of the polymorphisms in the first exons of
HoxA1 and HoxB1 and the progress of the association
studies have been described in a book chapter and two
abstracts (Rodier, 1998; Ingram et al., 1997; Stodgell et al.,
1998).

Example 7

Identification of a Second Polymorphism in HoxA1

A third polymorphism has been detected in the homeobox
region of HoxA1 in the second exon. The second exon
cannot be amplified by PCR from the DNA of four probands,
indicating that an anomaly exists. This indicates that they are
homozygous for a deviation from the published sequence on
which the primers for the exon were based. PCR amplification
yields suggest that about ten other probands are
heterozygotes for this polymorphism of the second exon of
HoxA1.

Additional primers have been developed that will allow
complete sequencing of the altered region, which appears to
be at the 3' end of the homeobox. Once the sequence is
established, a test (such as the use of restriction length
polymorphisms) can be developed to allow rapid evaluation
of DNA samples. The degree of association of this polymorphism
with autism spectrum disorders will then be
studied in the same groups already evaluated for the others.
Other studies in progress are designed to examine the second
exon of HoxB1 and the non-coding regions of both genes.

Example 8

Identification of Additional Polymorphisms in HoxB1 and
HoxD1 Associated with Autism

The procedures for evaluating the candidate gene HoxD1,
as well as for finding additional polymorphisms in HoxA1
and HoxB1, will be the same as for those already identified
in HoxA1 and HoxB1. Mutation detection in the coding
sequence of these genes will consist of PCR amplification,
cloning and sequencing. Mutation detection for the entire
genes will include large deletion/insertion analysis by
Southern blotting, analysis of 200-400 bp fragments by
SSCP or heteroduplex analysis, and of course cloning and
sequencing when heterozygosity becomes apparent for any
region of the genes. See Current Protocols in Human Genetics
(John Wiley & Sons, Inc.), Chapter 7, "Searching Candidate
Genes for Mutations."
Biological samples already isolated from patients with
autism which did not show any abnormalities in HoxA1 or
HoxB1 will be screened for polymorphisms in HoxD1.
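The group comparisons reported in Example 6 can be sketched as Pearson chi-square tests on 2x2 tables. The example below is an illustration under stated assumptions: carrier counts are back-calculated by rounding the percentages and group sizes given in Table 1 (the specification does not list the raw counts), and no continuity correction is applied. The function names are hypothetical.

```python
def carriers(percent, group_size):
    """Back-calculate a carrier count from a reported percentage (an assumption)."""
    return round(percent / 100 * group_size)


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 degree of freedom, no continuity correction).

    Table layout:
                    carrier   non-carrier
        group 1        a           b
        group 2        c           d
    """
    total = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("a row or column total is zero")
    return total * (a * d - b * c) ** 2 / denominator


def compare(percent_1, n_1, percent_2, n_2):
    """Chi-square for two groups described by percent carriers and group size."""
    a = carriers(percent_1, n_1)
    c = carriers(percent_2, n_2)
    return chi_square_2x2(a, n_1 - a, c, n_2 - c)


if __name__ == "__main__":
    # "HoxA1 or HoxB1" column of Table 1, each group against historic controls.
    print(f"probands vs historic controls: {compare(80, 46, 47, 101):.2f}")
    print(f"affected relatives vs historic controls: {compare(75, 24, 47, 101):.2f}")
```

With these back-calculated counts the two statistics come out at 14.83 and 6.30, in agreement with the values reported for probands and for other affected family members against historical controls. The proband versus parent-control comparison does not reproduce under the same assumptions, so the underlying counts or the test used there may differ.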

Although preferred embodiments have been depicted and
described in detail herein, it will be apparent to those skilled
in the relevant art that various modifications, additions,
substitutions, and the like can be made without departing
from the spirit of the invention and these therefore are
considered within the scope of the invention as defined in the
claims which follow.
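The logic of the restriction test in Example 4 can be checked against SEQ ID NO:1. The sketch below is illustrative only: it assumes the enzyme is Hph I with recognition sequence GGTGA (the enzyme name is only partly legible in this copy of the text), it looks only at nucleotides 181-240 of SEQ ID NO:1, and it ignores where the enzyme actually cuts relative to its site. All function names are hypothetical.

```python
WINDOW_START = 181  # 1-based position of the first base in the window below
# Nucleotides 181-240 of SEQ ID NO:1 (wild-type HoxA1).
WILD_TYPE = ("ATCGGTTCGC" "CCCACCACCA" "CCACCACCAC"
             "CACCATCACC" "ACCCCCAGCC" "GGCTACCTAC")
SITE = "GGTGA"  # assumed Hph I recognition sequence
CODONS = {"CAC": "His", "CGC": "Arg"}  # only the two codons needed here


def reverse_complement(sequence):
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(sequence))


def site_positions(sequence, site, offset):
    """Return 1-based start positions of the site on either strand."""
    patterns = {site, reverse_complement(site)}
    return [offset + i
            for i in range(len(sequence) - len(site) + 1)
            if sequence[i:i + len(site)] in patterns]


def substitute(sequence, position, base, offset):
    """Replace the base at a 1-based position of the full-length sequence."""
    index = position - offset
    return sequence[:index] + base + sequence[index + 1:]


def codon_at(sequence, position, offset):
    """Return the in-frame codon (frame starting at base 1) covering a position."""
    start = position - (position - 1) % 3 - offset
    return sequence[start:start + 3]


if __name__ == "__main__":
    mutant = substitute(WILD_TYPE, 218, "G", WINDOW_START)
    for name, seq in (("wild type", WILD_TYPE), ("A218G", mutant)):
        codon = codon_at(seq, 218, WINDOW_START)
        print(name, site_positions(seq, SITE, WINDOW_START), codon, CODONS[codon])
```

In this window the wild-type sequence carries one site, on the complementary strand, spanning nucleotides 216-220, and the A-to-G change at nucleotide 218 removes it while converting the codon from CAC (histidine) to CGC (arginine). This is consistent with the uncut mutant product and the histidine-to-arginine change described in Examples 3 and 4.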



SEQUENCE LISTING 



(1) GENERAL INFORMATION: 

(iii) NUMBER OF SEQUENCES: 10 



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1008 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:



ATGGACAATG CAAGAATGAA CTCCTTCCTG GAATACCCCA TACTTAGCAG TGGCGACTCG 60
GGGACCTGCT CAGCCCGAGC CTACCCCTCG GACCATAGGA TTACAACTTT CCAGTCGTGC 120
GCGGTCAGCG CCAACAGTTG CGGCGGCGAC GACCGCTTCC TAGTGGGCAG GGGGGTGCAG 180
ATCGGTTCGC CCCACCACCA CCACCACCAC CACCATCACC ACCCCCAGCC GGCTACCTAC 240
CAGACTTCCG GGAACCTGGG GGTGTCCTAC TCCCACTCAA GTTGTGGTCC AAGCTATGGC 300
TCACAGAACT TCAGTGCGCC TTACAGCCCC TACGCGTTAA ATCAGGAAGC AGACGTAAGT 360
GGTGGGTACC CCCAGTGCGC TCCCGCTGTT TACTCTGGAA ATCTCTCATC TCCCATGGTC 420
CAGCATCACC ACCACCACCA GGGTTATGCT GGGGGCGCGG TGGGCTCGCC TCAATACATT 480
CACCACTCAT ATGGACAGGA GCACCAGAGC CTGGCCCTGG CTACGTATAA TAACTCCTTG 540
TCCCCTCTCC ACGCCAGCCA CCAAGAAGCC TGTCGCTCCC CCGCATCGGA GACATCTTCT 600
CCAGCGCAGA CTTTTGACTG GATGAAAGTC AAAAGAAACC CTCCCAAAAC AGGGAAAGTT 660
GGAGAGTACG GCTACCTGGG TCAACCCAAC GCGGTGCGCA CCAACTTCAC TACCAAGCAG 720
CTCACGGAAC TGGAGAAGGA GTTCCACTTC AACAAGTACC TGACGCGCGC CCGCAGGGTG 780
GAGATCGCTG CATCCCTGCA GCTCAACGAG ACCCAAGTGA AGATCTGGTT CCAGAACCGC 840
CGAATGAAGC AAAAGAAACG TGAGAAGGAG GGTCTCTTGC CCATCTCTCC GGCCACCCCG 900
CCAGGAAACG ACGAGAAGGC CGAGGAATCC TCAGAGAAGT CCAGCTCTTC GCCCTGCGTT 960
CCTTCCCCGG GGTCTTCTAC CTCAGACACT CTGACTACCT CCCACTGA 1008



(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 335 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Asp Asn Ala Arg Met ABn Ser Phe Leu Glu Tyr Pro He Leu Ser 

15 10 15 

Ser Gly Asp Ser Gly Thr Cys Ser Ala Arg Ala Tyr Pro Ser Asp His 



Arg He Thr Thr Phe Gin Ser Cys Ala Val Ser Ala Asn Ser Cys Gly 

35 40 45 

Gly Asp Asp Arg Phe Leu Val Gly Arg Gly Val Gin He Gly Ser Pro 
50 55 60 

His His His His His His His His His His Pro Gin Pro Ala Thr Tyr 

65 70 75 80 

Gin Thr Ser Gly Asn Leu Gly Val Ser Tyr Ser His Ser Ser Cys Gly 



Pro Ser Tyr Gly Ser Gin Asn Phe ser Ala Pro Tyr Ser Pro Tyr Ala 
100 105 110 

Leu Asn Gin Glu Ala Asp Val Ser Gly Gly Tyr Pro Gin Cys Ala Pro 
115 120 125 

Ala Val Tyr Ser Gly Asn Leu Ser Ser Pro Met Val Gin His His His 
130 135 140 

His His Gin Gly Tyr Ala Gly Gly Ala Val Gly Ser Pro Gin Tyr He 
145 150 155 160 

His His Ser Tyr Gly Gin Glu His Gin Ser Leu Ala Leu Ala Thr Tyr 
165 170 175 

Asn Asn Ser Leu Ser Pro Leu His Ala Ser His Gin Glu Ala Cys Arg 
160 185 190 

Ser Pro Ala Ser Glu Thr Ser Ser Pro Ala Gin Thr Phe Asp Trp Met 
195 200 205 

Lys Val Lys Arg Asn Pro Pro Lys Thr Gly Lys Val Gly Glu Tyr Gly 
210 215 220 

Tyr Leu Gly Gin Pro Asn Ala Val Arg Thr Asn Phe Thr Thr Lys Gin 
225 230 235 240 

Leu Thr Glu Leu Glu Lys Glu Phe HiB Phe Asn Lys Tyr Leu Thr Arg 
245 250 255 

Ala Arg Arg Val Glu He Ala Ala Ser Leu Gin Leu Asn Glu Thr Gin 
260 265 270 

Val Lys He Trp Phe Gin Asn Arg Arg Met Lys Gin Lys Lys Arg Glu 
275 280 285 

Lys Glu Gly Leu Leu Pro He Ser Pro Ala Thr Pro Pro Gly Asn Asp 
290 295 300 

Glu Lys Ala Glu Glu Ser Ser Glu Lys Ser Ser Ser Ser Pro Cys Val 
305 310 315 320 

Pro Ser Pro Gly Ser Ser Thr Ser Asp Thr Leu Thr Thr Ser His 
325 330 335 
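As a consistency check between SEQ ID NO:1 and SEQ ID NO:2, the cDNA can be translated in reading frame 1 with the standard genetic code. The sketch below is illustrative only: the `translate` helper and its names are assumptions made for this example, and only the first 60 nucleotides of SEQ ID NO:1 are used. Note that 1008 nucleotides correspond to 336 codons, that is, 335 amino acids plus a stop codon, which agrees with the stated lengths.

```python
from itertools import product

# Standard genetic code with bases ordered T, C, A, G ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cdna: str) -> str:
    """Translate a coding sequence (frame 1) into one-letter amino acid codes."""
    cdna = cdna.replace(" ", "").upper()
    usable = len(cdna) - len(cdna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[cdna[i:i + 3]] for i in range(0, usable, 3))


# Nucleotides 1-60 of SEQ ID NO:1, copied from the listing.
first_row = "ATGGACAATG CAAGAATGAA CTCCTTCCTG GAATACCCCA TACTTAGCAG TGGCGACTCG"
peptide = translate(first_row)
print(peptide)
```

The printed peptide can be compared, residue by residue, with the first 20 residues of SEQ ID NO:2 (Met Asp Asn Ala Arg Met Asn Ser Phe Leu Glu Tyr Pro Ile Leu Ser Ser Gly Asp Ser).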



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1008 base pairs 

(B) TYPE: nucleic acid 

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

ATGGACAATG [illegible] CTCCTTCCTG GAATACCCCA [illegible] [illegible] 60

[illegible] CAGCCCGAGC CTACCCCTCG GACCATAGGA TTACAACTTT CCAGTCGTGC 120

[illegible] CCAACAGTTG CGGCGGCGAC [illegible] TAGTGGGCAG GGGGGTGCAG 180

[illegible] CCCACCACCA CCACCACCAC CACCATCGCC [illegible] [illegible] 240

CAGACTTCCG GGAACCTGGG GGTGTCCTAC TCCCACTCAA [illegible] [illegible] 300

TCACAGAACT TCAGTGCGCC TTACAGCCCC TACGCGTTAA ATCAGGAAGC AGACGTAAGT 360

[illegible] [illegible] [illegible] [illegible] ATCTCTCATC TCCCATGGTC 420

CAGCATCACC ACCACCACCA [illegible] GGGGGCGCGG TGGGCTCGCC TCAATACATT 480

CACCACTCAT [illegible] [illegible] [illegible] [illegible] [illegible] 540

TCCCCTCTCC ACGCCAGCCA CCAAGAAGCC TGTCGCTCCC CCGCATCGGA GACATCTTCT 600

CCAGCGCAGA CTTTTGACTG GATGAAAGTC AAAAGAAACC CTCCCAAAAC AGGGAAAGTT 660

GGAGAGTACG GCTACCTGGG TCAACCCAAC GCGGTGCGCA CCAACTTCAC TACCAAGCAG 720

CTCACGGAAC TGGAGAAGGA GTTCCACTTC AACAAGTACC TGACGCGCGC CCGCAGGGTG 780

GAGATCGCTG CATCCCTGCA GCTCAACGAG ACCCAAGTGA AGATCTGGTT CCAGAACCGC 840

CGAATGAAGC AAAAGAAACG TGAGAAGGAG GGTCTCTTGC CCATCTCTCC GGCCACCCCG 900

CCAGGAAACG ACGAGAAGGC CGAGGAATCC TCAGAGAAGT CCAGCTCTTC GCCCTGCGTT 960

CCTTCCCCGG GGTCTTCTAC CTCAGACACT CTGACTACCT CCCACTGA 1008



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 335 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Met Asp Asn Ala Arg Met Asn Ser Phe Leu Glu Tyr Pro Ile Leu Ser
1 5 10 15

Ser Gly Asp Ser Gly Thr Cys Ser Ala Arg Ala Tyr Pro Ser Asp His
20 25 30

Arg Ile Thr Thr Phe Gln Ser Cys Ala Val Ser Ala Asn Ser Cys Gly
35 40 45

Gly Asp Asp Arg Phe Leu Val Gly Arg Gly Val Gln Ile Gly Ser Pro
50 55 60

His His His His His His His His Arg His Pro Gln Pro Ala Thr Tyr
65 70 75 80

Gln Thr Ser Gly Asn Leu Gly Val Ser Tyr Ser His Ser Ser Cys Gly
85 90 95

Pro Ser Tyr Gly Ser Gln Asn Phe Ser Ala Pro Tyr Ser Pro Tyr Ala
100 105 110

Leu Asn Gln Glu Ala Asp Val Ser Gly Gly Tyr Pro Gln Cys Ala Pro
115 120 125

Ala Val Tyr Ser Gly Asn Leu Ser Ser Pro Met Val Gln His His His
130 135 140

His His Gln Gly Tyr Ala Gly Gly Ala Val Gly Ser Pro Gln Tyr Ile
145 150 155 160

His His Ser Tyr Gly Gln Glu His Gln Ser Leu Ala Leu Ala Thr Tyr
165 170 175

Asn Asn Ser Leu Ser Pro Leu His Ala Ser His Gln Glu Ala Cys Arg
180 185 190

Ser Pro Ala Ser Glu Thr Ser Ser Pro Ala Gln Thr Phe Asp Trp Met
195 200 205

Lys Val Lys Arg Asn Pro Pro Lys Thr Gly Lys Val Gly Glu Tyr Gly
210 215 220

Tyr Leu Gly Gln Pro Asn Ala Val Arg Thr Asn Phe Thr Thr Lys Gln
225 230 235 240

Leu Thr Glu Leu Glu Lys Glu Phe His Phe Asn Lys Tyr Leu Thr Arg
245 250 255

Ala Arg Arg Val Glu Ile Ala Ala Ser Leu Gln Leu Asn Glu Thr Gln
260 265 270

Val Lys Ile Trp Phe Gln Asn Arg Arg Met Lys Gln Lys Lys Arg Glu
275 280 285

Lys Glu Gly Leu Leu Pro Ile Ser Pro Ala Thr Pro Pro Gly Asn Asp
290 295 300

Glu Lys Ala Glu Glu Ser Ser Glu Lys Ser Ser Ser Ser Pro Cys Val
305 310 315 320

Pro Ser Pro Gly Ser Ser Thr Ser Asp Thr Leu Thr Thr Ser His
325 330 335
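The relationship between the single base difference at nucleotide 218 and the residue difference at position 73 of SEQ ID NO:2 and SEQ ID NO:4 can be worked through numerically. The following is a minimal sketch, not part of the listing: it uses only nucleotides 181-240 of SEQ ID NO:1, assumes the A-to-G change that the surviving text of SEQ ID NO:3 shows at that position, and all variable names are chosen for illustration.

```python
# Nucleotides 181-240 of SEQ ID NO:1, copied from the listing.
ROW_START = 181
wild_type = (
    "ATCGGTTCGC CCCACCACCA CCACCACCAC CACCATCACC ACCCCCAGCC GGCTACCTAC"
).replace(" ", "")

POSITION = 218  # 1-based nucleotide position of the substitution
offset = POSITION - ROW_START  # 0-based offset inside this row
variant = wild_type[:offset] + "G" + wild_type[offset + 1:]  # assumed A -> G change

# Codon containing the position, counting codons from nucleotide 1.
codon_number = (POSITION - 1) // 3 + 1
codon_offset = (codon_number - 1) * 3 + 1 - ROW_START
wt_codon = wild_type[codon_offset:codon_offset + 3]
var_codon = variant[codon_offset:codon_offset + 3]

# Only the two codons involved are listed (standard genetic code).
RESIDUE = {"CAC": "His", "CGC": "Arg"}
print(codon_number, wt_codon, RESIDUE[wt_codon], "->", var_codon, RESIDUE[var_codon])
```

Under these assumptions the substitution falls in the second base of codon 73 and converts a histidine codon into an arginine codon, in line with the arginine-for-histidine substitution recited in the claims.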



(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1021 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5: 

TGACGCATGG ACTATAATAG GATGAACTCC TTCTTAGAGT ACCCACTCTG TAACCGGGGA 60 

CCCAGCGCCT ACAGCGCCCA CAGCGCCCCA ACCTCCTTTC CCCCAAGCTC GGCTCAGGCG 120 

GTTGACAGCT ATGCAAGCGA GGGCCGCTAC GGTGGGGGGC TGTCCAGCCC TGCGTTTCAG 180 

CAGAACTCCG GCTATCCCGC CCAGCAGCCG CCTTCGACCC TGGGGGTGCC CTTCCCCAGC 240

TCCGCGCCCT CGGGGTATGC TCCTGCCGCC TGCAGCCCCA GCTACGGGCC TTCTCAGTAC 300 

TACCCTCTGG GTCAATCAGA AGGAGACGGA GGCTATTTTC ATCCCTCGAG CTACGGGGCC 360 

CAGCTAGGGG GCTTGTCCGA TGGCTACGGA GCAGGTGGAG CCGGTCCGGG GCCATATCCT 420 

CCGCAGCATC CCCCTTATGG GAACGAGCAG ACCGCGAGCT TTGCACCGGC CTATGCTGAT 480 

CTCCTCTCCG AGGACAAGGA AACACCCTGC CCTTCAGAAC CTAACACCCC CACGGCCCGG 540

ACCTTCGACT GGATGAAGGT TAAGAGAAAC CCACCCAAGA CAGCGAAGGT GTCAGAGCCA 600 

GGCCTGGGCT CGCCCAGTGG CCTCCGCACC AACTTCACCA CAAGGCAGCT GACAGAACTG 660 

GAAAAGGAGT TCCATTTCAA CAAGTACCTG AGCCGGGCCC GGAGGGTGGA GATTGCCGCC 720 

ACCCTGGAGC TCAATGAAAC ACAGGTCAAG ATTTGGTTCC AGAACCGACG AATGAAGCAG 780 

AAGAAGCGCG AGCGAGAGGG AGGTCGGGTC CCCCCAGCCC CACCAGGCTG CCCCAAGGAG 840 

GCAGCTGGAG ATGCCTCAGA CCAGTCGACA TGCACCTCCC CGGAAGCCTC ACCCAGCTCT 900 

GTCACCTCCT GAACTGAACC TAGCCACCAA TGGGGCTTCC AGGCACTGGA GCGCCCCAGT 960 

CCAGCCCTAT CCCAGGCTCT CCCAACCCAG GCCTGGCTTC ACTGCCTGGG ATCTCTAGGC 1020 



US 6,228,582 B1



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 301 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

Met Asp Tyr Asn Arg Met Asn Ser Phe Leu Glu Tyr Pro Leu Cys Asn
1 5 10 15

Arg Gly Pro Ser Ala Tyr Ser Ala His Ser Ala Pro Thr Ser Phe Pro
20 25 30

Pro Ser Ser Ala Gln Ala Val Asp Ser Tyr Ala Ser Glu Gly Arg Tyr
35 40 45

Gly Gly Gly Leu Ser Ser Pro Ala Phe Gln Gln Asn Ser Gly Tyr Pro
50 55 60

Ala Gln Gln Pro Pro Ser Thr Leu Gly Val Pro Phe Pro Ser Ser Ala
65 70 75 80

Pro Ser Gly Tyr Ala Pro Ala Ala Cys Ser Pro Ser Tyr Gly Pro Ser
85 90 95

Gln Tyr Tyr Pro Leu Gly Gln Ser Glu Gly Asp Gly Gly Tyr Phe His
100 105 110

Pro Ser Ser Tyr Gly Ala Gln Leu Gly Gly Leu Ser Asp Gly Tyr Gly
115 120 125

Ala Gly Gly Ala Gly Pro Gly Pro Tyr Pro Pro Gln His Pro Pro Tyr
130 135 140

Gly Asn Glu Gln Thr Ala Ser Phe Ala Pro Ala Tyr Ala Asp Leu Leu
145 150 155 160

Ser Glu Asp Lys Glu Thr Pro Cys Pro Ser Glu Pro Asn Thr Pro Thr
165 170 175

Ala Arg Thr Phe Asp Trp Met Lys Val Lys Arg Asn Pro Pro Lys Thr
180 185 190

Ala Lys Val Ser Glu Pro Gly Leu Gly Ser Pro Ser Gly Leu Arg Thr
195 200 205

Asn Phe Thr Thr Arg Gln Leu Thr Glu Leu Glu Lys Glu Phe His Phe
210 215 220

Asn Lys Tyr Leu Ser Arg Ala Arg Arg Val Glu Ile Ala Ala Thr Leu
225 230 235 240

Glu Leu Asn Glu Thr Gln Val Lys Ile Trp Phe Gln Asn Arg Arg Met
245 250 255

Lys Gln Lys Lys Arg Glu Arg Glu Gly Gly Arg Val Pro Pro Ala Pro
260 265 270

Pro Gly Cys Pro Lys Glu Ala Ala Gly Asp Ala Ser Asp Gln Ser Thr
275 280 285

Cys Thr Ser Pro Glu Ala Ser Pro Ser Ser Val Thr Ser
290 295 300



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1030 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

TGACGCATGG ACTATAATAG GATGAACTCC TTCTTAGAGT ACCCACTCTG TAACCGGGGA 60 

CCCAGCGCCT ACAGCGCCCA CAGCGCCCAC AGCGCCCCAA CCTCCTTTCC CCCAAGCTCG 120 

GCTCAGGCGG TTGACAGCTA TGCAAGCGAG GGCCGCTACG GTGGGGGGCT GTCCAGCCCT 180 

GCGTTTCAGC AGAACTCCGG CTATCCCGCC CAGCAGCCGC CTTCGACCCT GGGGGTGCCC 240 

TTCCCCAGCT CCGCGCCCTC GGGGTATGCT CCTGCCGCCT GCAGCCCCAG CTACGGGCCT 300 

TCTCAGTACT ACCCTCTGGG TCAATCAGAA GGAGACGGAG GCTATTTTCA TCCCTCGAGC 360 

TACGGGGCCC AGCTAGGGGG CTTGTCCGAT GGCTACGGAG CAGGTGGAGC CGGTCCGGGG 420 

CCATATCCTC CGCAGCATCC CCCTTATGGG AACGAGCAGA CCGCGAGCTT TGCACCGGCC 480 

TATGCTGATC TCCTCTCCGA GGACAAGGAA ACACCCTGCC CTTCAGAACC TAACACCCCC 540 

ACGGCCCGGA CCTTCGACTG GATGAAGGTT AAGAGAAACC CACCCAAGAC AGCGAAGGTG 600 

TCAGAGCCAG GCCTGGGCTC GCCCAGTGGC CTCCGCACCA ACTTCACCAC AAGGCAGCTG 660 

ACAGAACTGG AAAAGGAGTT CCATTTCAAC AAGTACCTGA GCCGGGCCCG GAGGGTGGAG 720

ATTGCCGCCA CCCTGGAGCT CAATGAAACA CAGGTCAAGA TTTGGTTCCA GAACCGACGA 780 

ATGAAGCAGA AGAAGCGCGA GCGAGAGGGA GGTCGGGTCC CCCCAGCCCC ACCAGGCTGC 840 

CCCAAGGAGG CAGCTGGAGA TGCCTCAGAC CAGTCGACAT GCACCTCCCC GGAAGCCTCA 900 

CCCAGCTCTG TCACCTCCTG AACTGAACCT AGCCACCAAT GGGGCTTCCA GGCACTGGAG 960 

CGCCCCAGTC CAGCCCTATC CCAGGCTCTC CCAACCCAGG CCTGGCTTCA CTGCCTGGGA 1020 

TCTCTAGGCT 1030 



(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 304 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

Met Asp Tyr Asn Arg Met Asn Ser Phe Leu Glu Tyr Pro Leu Cys Asn
1 5 10 15

Arg Gly Pro Ser Ala Tyr Ser Ala His Ser Ala His Ser Ala Pro Thr
20 25 30

Ser Phe Pro Pro Ser Ser Ala Gln Ala Val Asp Ser Tyr Ala Ser Glu
35 40 45

Gly Arg Tyr Gly Gly Gly Leu Ser Ser Pro Ala Phe Gln Gln Asn Ser
50 55 60

Gly Tyr Pro Ala Gln Gln Pro Pro Ser Thr Leu Gly Val Pro Phe Pro
65 70 75 80

Ser Ser Ala Pro Ser Gly Tyr Ala Pro Ala Ala Cys Ser Pro Ser Tyr
85 90 95

Gly Pro Ser Gln Tyr Tyr Pro Leu Gly Gln Ser Glu Gly Asp Gly Gly
100 105 110

Tyr Phe His Pro Ser Ser Tyr Gly Ala Gln Leu Gly Gly Leu Ser Asp
115 120 125

Gly Tyr Gly Ala Gly Gly Ala Gly Pro Gly Pro Tyr Pro Pro Gln His
130 135 140

Pro Pro Tyr Gly Asn Glu Gln Thr Ala Ser Phe Ala Pro Ala Tyr Ala
145 150 155 160

Asp Leu Leu Ser Glu Asp Lys Glu Thr Pro Cys Pro Ser Glu Pro Asn
165 170 175

Thr Pro Thr Ala Arg Thr Phe Asp Trp Met Lys Val Lys Arg Asn Pro
180 185 190

Pro Lys Thr Ala Lys Val Ser Glu Pro Gly Leu Gly Ser Pro Ser Gly
195 200 205

Leu Arg Thr Asn Phe Thr Thr Arg Gln Leu Thr Glu Leu Glu Lys Glu
210 215 220

Phe His Phe Asn Lys Tyr Leu Ser Arg Ala Arg Arg Val Glu Ile Ala
225 230 235 240

Ala Thr Leu Glu Leu Asn Glu Thr Gln Val Lys Ile Trp Phe Gln Asn
245 250 255

Arg Arg Met Lys Gln Lys Lys Arg Glu Arg Glu Gly Gly Arg Val Pro
260 265 270

Pro Ala Pro Pro Gly Cys Pro Lys Glu Ala Ala Gly Asp Ala Ser Asp
275 280 285

Gln Ser Thr Cys Thr Ser Pro Glu Ala Ser Pro Ser Ser Val Thr Ser
290 295 300
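SEQ ID NO:5 and SEQ ID NO:7 differ in length by nine nucleotides (1021 versus 1030), and the encoded proteins differ by three residues (301 versus 304). The sketch below checks, on a short excerpt only, that inserting 5'-ACAGCGCCC-3' between nucleotides 88 and 89 of SEQ ID NO:5 reproduces the corresponding stretch of SEQ ID NO:7. The helper `insert_between` and the variable names are assumptions introduced for this illustration.

```python
INSERTION = "ACAGCGCCC"  # 5'-ACAGCGCCC-3'

# Nucleotides 61-120 of SEQ ID NO:5 and 61-129 of SEQ ID NO:7, copied from the listings.
ROW_START = 61
seq5_segment = (
    "CCCAGCGCCT ACAGCGCCCA CAGCGCCCCA ACCTCCTTTC CCCCAAGCTC GGCTCAGGCG"
).replace(" ", "")
seq7_segment = (
    "CCCAGCGCCT ACAGCGCCCA CAGCGCCCAC AGCGCCCCAA CCTCCTTTCC CCCAAGCTCG"
    " GCTCAGGCG"
).replace(" ", "")


def insert_between(segment: str, start: int, left_pos: int, insertion: str) -> str:
    """Insert `insertion` directly after 1-based position `left_pos`.

    `start` is the 1-based position of the first base of `segment`.
    """
    cut = left_pos - start + 1
    return segment[:cut] + insertion + segment[cut:]


predicted = insert_between(seq5_segment, ROW_START, 88, INSERTION)
print(predicted == seq7_segment)      # excerpt of SEQ ID NO:7 is reproduced
print(len(INSERTION) % 3 == 0)        # nine bases: the reading frame is preserved
```

Because the inserted nine bases repeat the nine bases that precede them, the same product would be obtained from an insertion at several neighbouring positions; the sketch simply uses the position given in the claims.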



(2) INFORMATION FOR SEQ ID NO:9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "primer"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9: 

GCATGGACTA TAATAGGATG 



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "primer"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10: 

TCTTGGGTGG GTTTCTCTTA 
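The two primer sequences can be compared against SEQ ID NO:5 directly. The sketch below is a consistency check only, not a description of any protocol in this document: it searches two rows of SEQ ID NO:5 for SEQ ID NO:9 and for the reverse complement of SEQ ID NO:10, and the helper names (`reverse_complement`, `locate`) are assumptions made for the example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


forward = "GCATGGACTATAATAGGATG"  # SEQ ID NO:9
reverse = "TCTTGGGTGGGTTTCTCTTA"  # SEQ ID NO:10

# Two rows of SEQ ID NO:5 copied from the listing, keyed by the
# 1-based position of their first base.
rows = {
    1: "TGACGCATGG ACTATAATAG GATGAACTCC TTCTTAGAGT ACCCACTCTG TAACCGGGGA",
    541: "ACCTTCGACT GGATGAAGGT TAAGAGAAAC CCACCCAAGA CAGCGAAGGT GTCAGAGCCA",
}


def locate(site: str, rows: dict):
    """Return the 1-based (first, last) positions of `site`, or None if absent."""
    for start, row in rows.items():
        idx = row.replace(" ", "").find(site)
        if idx >= 0:
            first = start + idx
            return first, first + len(site) - 1
    return None


fwd_site = locate(forward, rows)
rev_site = locate(reverse_complement(reverse), rows)
span = rev_site[1] - fwd_site[0] + 1
print(fwd_site, rev_site, span)
```

On these excerpts the first primer matches the sense strand near the start of SEQ ID NO:5 and the second matches the opposite strand further downstream, so the region between them includes nucleotides 88 and 89.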



What is claimed: 

1. A method for screening subjects for genetic markers associated with autism, comprising:

isolating a biological sample from a mammal; and

testing the sample or genetic material isolated from the sample for a polymorphism in a Hox A1 or B1 coding sequence which is a genetic marker for autism.

2. The method according to claim 1, wherein the biological sample is selected from the group consisting of blood, saliva, amniotic fluid, and tissue.

3. The method according to claim 2, wherein the biological sample is blood.

4. The method according to claim 1, wherein the mammal is a human.

5. The method according to claim 4, wherein the biological sample is isolated from developmentally disabled children.

6. The method according to claim 4, wherein the biological sample is isolated from parents or relatives of developmentally disabled children.

7. The method according to claim 4, wherein the biological sample is isolated from children and said method further comprises:

early behavior training for children having genetic markers associated with autism.

8. The method according to claim 1, wherein the polymorphism is located in the homeobox.

9. The method according to claim 1, wherein the coding sequence has a single base substitution resulting in an amino acid substitution.

10. The method according to claim 9, wherein the amino acid substitution is an arginine for a histidine.

11. The method according to claim 10, wherein the coding sequence has an insertion.

12. The method according to claim 11, wherein the insertion is 5'-ACAGCGCCC-3'.

13. The method according to claim 1, wherein the coding sequence has a polymorphism selected from the group consisting of a single base substitution resulting in an amino acid substitution, a single base substitution resulting in a translational stop, an insertion, a deletion, and a rearrangement.

14. The method according to claim 1, wherein the polymorphism alters the sequence of the polypeptide encoded by the coding sequence.

15. The method according to claim 1, wherein said screening for mutated nucleic acids is carried out by a method selected from the group consisting of direct sequencing of nucleic acids, single strand polymorphism assay, restriction fragment length polymorphism assay, ligase chain reaction, enzymatic cleavage and southern hybridization.

16. The method according to claim 15, wherein said screening is carried out by direct sequencing of nucleic acids.

17. The method according to claim 15, wherein said screening is carried out by single strand polymorphism assay.

18. The method according to claim 15, wherein said screening is carried out by restriction fragment length polymorphism assay.

19. The method according to claim 15, wherein said screening is carried out by ligase chain reaction.

20. The method according to claim 15, wherein said screening is carried out by enzymatic cleavage.

21. The method according to claim 15, wherein said screening is carried out by southern hybridization.

22. The method according to claim 15, wherein the nucleic acid is a deoxyribonucleic acid.

23. The method according to claim 15, wherein the nucleic acid is a messenger ribonucleic acid.

24. An isolated nucleic acid molecule comprising the nucleotide sequence set forth in SEQ ID NO:1, wherein the nucleic acid molecule comprises a single base substitution at nucleotide 218.

25. An isolated nucleic acid molecule comprising the nucleotide sequence set forth in SEQ ID NO:5, wherein the nucleic acid molecule comprises an insertion between nucleotides 88 and 89.

26. The isolated nucleic acid molecule according to claim 25, wherein the insertion is 5'-ACAGCGCCC-3'.

27. An isolated nucleic acid molecule consisting of at least 15 contiguous nucleotides of the coding sequence set forth in SEQ ID NO:5 wherein the molecule comprises an insertion between nucleotides 88 and 89 in SEQ ID NO:5 and wherein the molecule specifically binds to a HoxA1 or HoxB1 coding sequence but does not bind to other coding sequences.

28. An isolated nucleic acid molecule consisting of at least 15 contiguous nucleotides of the coding sequence set forth in SEQ ID NO:1 wherein the molecule comprises a single base substitution at nucleotide 218 and wherein the molecule specifically binds to a HoxA1 or HoxB1 coding sequence but does not bind to other coding sequences.

29. The method according to claim 1 wherein the coding sequence has a mutation in an exon.

* * * * *